Overview

Dataset statistics

Number of variables: 12
Number of observations: 5680
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 532.6 KiB
Average record size in memory: 96.0 B

Variable types

Numeric: 12

Alerts

df_index is highly correlated with customer_id and 2 other fields [High correlation]
gross_revenue is highly correlated with purchases_quantity and 4 other fields [High correlation]
recency_days is highly correlated with df_index and 2 other fields [High correlation]
purchases_quantity is highly correlated with gross_revenue and 4 other fields [High correlation]
basket_size is highly correlated with gross_revenue and 4 other fields [High correlation]
qt_products is highly correlated with gross_revenue and 3 other fields [High correlation]
max_recency is highly correlated with df_index and 2 other fields [High correlation]
qt_returns is highly correlated with gross_revenue and 4 other fields [High correlation]
purchased_returned_diff is highly correlated with gross_revenue and 2 other fields [High correlation]
frequency is highly correlated with purchases_quantity [High correlation]
customer_id is highly correlated with df_index and 2 other fields [High correlation]
avg_ticket is highly correlated with qt_returns [High correlation]
gross_revenue is highly skewed (γ1 = 23.01864323) [Skewed]
basket_size is highly skewed (γ1 = 25.07383309) [Skewed]
avg_ticket is highly skewed (γ1 = 48.13887319) [Skewed]
qt_returns is highly skewed (γ1 = 29.86213869) [Skewed]
df_index is uniformly distributed [Uniform]
df_index has unique values [Unique]
customer_id has unique values [Unique]
qt_returns has 4190 (73.8%) zeros [Zeros]
purchased_returned_diff has 115 (2.0%) zeros [Zeros]
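
The Skewed and Zeros alerts rest on two simple per-column quantities. The sketch below, using only the Python standard library, computes a moment-based skewness coefficient γ1 and the share of zeros. The function names and the toy list are illustrative assumptions, not part of pandas-profiling, and pandas applies a small-sample bias correction, so the values in the report can differ slightly from this plain definition.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Toy data (hypothetical): mostly small values with one large outlier.
toy = [0, 0, 0, 1, 2, 50]
print(skewness(toy) > 0)    # a long right tail gives positive skewness
print(zero_share(toy))      # 3 of 6 entries are zero

# The zero percentage in the qt_returns alert follows from the reported counts.
print(round(100 * 4190 / 5680, 1))
```

A long right tail, as in gross_revenue or qt_returns, pushes γ1 far above zero, which is what triggers the Skewed alert.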

Reproduction

Analysis started: 2022-11-18 03:02:04.449903
Analysis finished: 2022-11-18 03:02:39.155706
Duration: 34.71 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json
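
A report of this kind can typically be regenerated along the following lines. This is only a sketch: the CSV path, output path, and report title are hypothetical, and the settings stored in config.json for the original run are not known. The imports sit inside the function so that it can be defined even where pandas and pandas-profiling are not installed.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Lazy imports: both packages are needed only when the function is called.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is an assumption; the original report's settings are unknown.
    profile = ProfileReport(df, title="Customer features profile")
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("customers.csv")` with a real file would write `report.html` next to the script.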

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 5680
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2888.285387
Minimum: 0
Maximum: 5770
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 288.95
Q1: 1450.75
median: 2890.5
Q3: 4329.25
95-th percentile: 5480.05
Maximum: 5770
Range: 5770
Interquartile range (IQR): 2878.5

Descriptive statistics

Standard deviation: 1664.584059
Coefficient of variation (CV): 0.5763225707
Kurtosis: -1.196205237
Mean: 2888.285387
Median Absolute Deviation (MAD): 1439.5
Skewness: -0.003560571232
Sum: 16405461
Variance: 2770840.091
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 1 | < 0.1%
3879 | 1 | < 0.1%
3855 | 1 | < 0.1%
3854 | 1 | < 0.1%
3853 | 1 | < 0.1%
3852 | 1 | < 0.1%
3851 | 1 | < 0.1%
3850 | 1 | < 0.1%
3849 | 1 | < 0.1%
3848 | 1 | < 0.1%
Other values (5670) | 5670 | 99.8%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
5770 | 1 | < 0.1%
5769 | 1 | < 0.1%
5768 | 1 | < 0.1%
5767 | 1 | < 0.1%
5766 | 1 | < 0.1%
5765 | 1 | < 0.1%
5764 | 1 | < 0.1%
5763 | 1 | < 0.1%
5762 | 1 | < 0.1%
5761 | 1 | < 0.1%
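
Several of the derived statistics reported for df_index follow directly from the others. As a quick consistency check, the sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The variable names are illustrative, and small differences in the last digits come from the rounding of the reported inputs.

```python
# Values copied from the df_index summary statistics.
minimum, maximum = 0, 5770
q1, q3 = 1450.75, 4329.25
mean, std = 2888.285387, 1664.584059

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared

print(value_range)
print(iqr)
print(round(cv, 4))
print(round(variance))
```

The same relations hold for every variable in the report, so they are a cheap way to spot transcription errors.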

customer_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 5680
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16433.86796
Minimum: 12347
Maximum: 21997
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12702.95
Q1: 14291.75
median: 16231
Q3: 18212.25
95-th percentile: 21031.2
Maximum: 21997
Range: 9650
Interquartile range (IQR): 3920.5

Descriptive statistics

Standard deviation: 2563.32051
Coefficient of variation (CV): 0.1559779181
Kurtosis: -0.8653935729
Mean: 16433.86796
Median Absolute Deviation (MAD): 1960
Skewness: 0.3123861003
Sum: 93344370
Variance: 6570612.038
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
17850 | 1 | < 0.1%
15007 | 1 | < 0.1%
20375 | 1 | < 0.1%
20374 | 1 | < 0.1%
15578 | 1 | < 0.1%
12424 | 1 | < 0.1%
20372 | 1 | < 0.1%
17837 | 1 | < 0.1%
20369 | 1 | < 0.1%
14327 | 1 | < 0.1%
Other values (5670) | 5670 | 99.8%

Minimum 10 values
Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12349 | 1 | < 0.1%
12350 | 1 | < 0.1%
12352 | 1 | < 0.1%
12353 | 1 | < 0.1%
12354 | 1 | < 0.1%
12355 | 1 | < 0.1%
12356 | 1 | < 0.1%
12357 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
21997 | 1 | < 0.1%
21996 | 1 | < 0.1%
21995 | 1 | < 0.1%
21994 | 1 | < 0.1%
21993 | 1 | < 0.1%
21992 | 1 | < 0.1%
21988 | 1 | < 0.1%
21987 | 1 | < 0.1%
21984 | 1 | < 0.1%
21983 | 1 | < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 5438
Distinct (%): 95.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1762.764634
Minimum: 0.42
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 13.3735
Q1: 237.2875
median: 613.93
Q3: 1571.07
95-th percentile: 5307.991
Maximum: 279138.02
Range: 279137.6
Interquartile range (IQR): 1333.7825

Descriptive statistics

Standard deviation: 7510.45918
Coefficient of variation (CV): 4.260613718
Kurtosis: 699.5570111
Mean: 1762.764634
Median Absolute Deviation (MAD): 479.46
Skewness: 23.01864323
Sum: 10012503.12
Variance: 56406997.1
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
7.95 | 9 | 0.2%
4.95 | 8 | 0.1%
1.25 | 8 | 0.1%
2.95 | 8 | 0.1%
3.75 | 7 | 0.1%
1.65 | 7 | 0.1%
12.75 | 7 | 0.1%
5.95 | 6 | 0.1%
4.25 | 6 | 0.1%
7.5 | 6 | 0.1%
Other values (5428) | 5608 | 98.7%

Minimum 10 values
Value | Count | Frequency (%)
0.42 | 1 | < 0.1%
0.65 | 1 | < 0.1%
0.79 | 1 | < 0.1%
0.84 | 4 | 0.1%
0.85 | 3 | 0.1%
1.07 | 1 | < 0.1%
1.25 | 8 | 0.1%
1.44 | 1 | < 0.1%
1.65 | 7 | 0.1%
1.69 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
279138.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
136275.72 | 1 | < 0.1%
124564.53 | 1 | < 0.1%
116729.63 | 1 | < 0.1%
91062.38 | 1 | < 0.1%
72882.09 | 1 | < 0.1%
66653.56 | 1 | < 0.1%
65039.62 | 1 | < 0.1%

recency_days
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 304
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 116.8264085
Minimum: 0
Maximum: 373
Zeros: 37
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 22
median: 71
Q3: 199.25
95-th percentile: 338
Maximum: 373
Range: 373
Interquartile range (IQR): 177.25

Descriptive statistics

Standard deviation: 111.6124711
Coefficient of variation (CV): 0.9553702158
Kurtosis: -0.640424192
Mean: 116.8264085
Median Absolute Deviation (MAD): 61
Skewness: 0.8152565497
Sum: 663574
Variance: 12457.34369
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 110 | 1.9%
4 | 105 | 1.8%
3 | 98 | 1.7%
2 | 92 | 1.6%
10 | 86 | 1.5%
8 | 82 | 1.4%
17 | 79 | 1.4%
9 | 79 | 1.4%
7 | 77 | 1.4%
15 | 66 | 1.2%
Other values (294) | 4806 | 84.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 37 | 0.7%
1 | 110 | 1.9%
2 | 92 | 1.6%
3 | 98 | 1.7%
4 | 105 | 1.8%
5 | 52 | 0.9%
7 | 77 | 1.4%
8 | 82 | 1.4%
9 | 79 | 1.4%
10 | 86 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
373 | 23 | 0.4%
372 | 22 | 0.4%
371 | 17 | 0.3%
369 | 4 | 0.1%
368 | 13 | 0.2%
367 | 16 | 0.3%
366 | 15 | 0.3%
365 | 19 | 0.3%
364 | 11 | 0.2%
362 | 7 | 0.1%

purchases_quantity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 57
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.47693662
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 4
95-th percentile: 11
Maximum: 206
Range: 205
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 6.81469284
Coefficient of variation (CV): 1.959970395
Kurtosis: 300.1339365
Mean: 3.47693662
Median Absolute Deviation (MAD): 0
Skewness: 13.14920397
Sum: 19749
Variance: 46.44003851
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 2858 | 50.3%
2 | 823 | 14.5%
3 | 503 | 8.9%
4 | 394 | 6.9%
5 | 237 | 4.2%
6 | 173 | 3.0%
7 | 138 | 2.4%
8 | 98 | 1.7%
9 | 70 | 1.2%
11 | 54 | 1.0%
Other values (47) | 332 | 5.8%

Minimum 10 values
Value | Count | Frequency (%)
1 | 2858 | 50.3%
2 | 823 | 14.5%
3 | 503 | 8.9%
4 | 394 | 6.9%
5 | 237 | 4.2%
6 | 173 | 3.0%
7 | 138 | 2.4%
8 | 98 | 1.7%
9 | 70 | 1.2%
10 | 54 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
206 | 1 | < 0.1%
198 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | < 0.1%
86 | 1 | < 0.1%
72 | 1 | < 0.1%
62 | 2 | < 0.1%
60 | 1 | < 0.1%
57 | 1 | < 0.1%

basket_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1841
Distinct (%): 32.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 953.1683099
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4.95
Q1: 106
median: 317.5
Q3: 805.25
95-th percentile: 2927.8
Maximum: 196844
Range: 196843
Interquartile range (IQR): 699.25

Descriptive statistics

Standard deviation: 4194.233908
Coefficient of variation (CV): 4.400307757
Kurtosis: 940.6390595
Mean: 953.1683099
Median Absolute Deviation (MAD): 253.5
Skewness: 25.07383309
Sum: 5413996
Variance: 17591598.07
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 114 | 2.0%
2 | 70 | 1.2%
3 | 51 | 0.9%
4 | 49 | 0.9%
5 | 35 | 0.6%
6 | 29 | 0.5%
12 | 24 | 0.4%
72 | 21 | 0.4%
88 | 21 | 0.4%
7 | 20 | 0.4%
Other values (1831) | 5246 | 92.4%

Minimum 10 values
Value | Count | Frequency (%)
1 | 114 | 2.0%
2 | 70 | 1.2%
3 | 51 | 0.9%
4 | 49 | 0.9%
5 | 35 | 0.6%
6 | 29 | 0.5%
7 | 20 | 0.4%
8 | 18 | 0.3%
9 | 7 | 0.1%
10 | 17 | 0.3%

Maximum 10 values
Value | Count | Frequency (%)
196844 | 1 | < 0.1%
80179 | 1 | < 0.1%
77373 | 1 | < 0.1%
69993 | 1 | < 0.1%
64549 | 1 | < 0.1%
64124 | 1 | < 0.1%
63312 | 1 | < 0.1%
58343 | 1 | < 0.1%
57872 | 1 | < 0.1%
50255 | 1 | < 0.1%

qt_products
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 529
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 92.79137324
Minimum: 1
Maximum: 7838
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 14
median: 41
Q3: 107
95-th percentile: 333
Maximum: 7838
Range: 7837
Interquartile range (IQR): 93

Descriptive statistics

Standard deviation: 210.4125806
Coefficient of variation (CV): 2.267587743
Kurtosis: 508.0725752
Mean: 92.79137324
Median Absolute Deviation (MAD): 33
Skewness: 17.6926127
Sum: 527055
Variance: 44273.45409
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 254 | 4.5%
2 | 147 | 2.6%
3 | 108 | 1.9%
10 | 100 | 1.8%
6 | 98 | 1.7%
9 | 92 | 1.6%
5 | 90 | 1.6%
4 | 86 | 1.5%
7 | 83 | 1.5%
13 | 81 | 1.4%
Other values (519) | 4541 | 79.9%

Minimum 10 values
Value | Count | Frequency (%)
1 | 254 | 4.5%
2 | 147 | 2.6%
3 | 108 | 1.9%
4 | 86 | 1.5%
5 | 90 | 1.6%
6 | 98 | 1.7%
7 | 83 | 1.5%
8 | 80 | 1.4%
9 | 92 | 1.6%
10 | 100 | 1.8%

Maximum 10 values
Value | Count | Frequency (%)
7838 | 1 | < 0.1%
5589 | 1 | < 0.1%
5095 | 1 | < 0.1%
4580 | 1 | < 0.1%
2698 | 1 | < 0.1%
2379 | 1 | < 0.1%
2060 | 1 | < 0.1%
1818 | 1 | < 0.1%
1673 | 1 | < 0.1%
1637 | 1 | < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 5489
Distinct (%): 96.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.17597255
Minimum: 0.42
Maximum: 13305.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 3.460929841
Q1: 7.948445596
median: 15.83608609
Q3: 21.9193109
95-th percentile: 75.94814286
Maximum: 13305.5
Range: 13305.08
Interquartile range (IQR): 13.9708653

Descriptive statistics

Standard deviation: 210.7379503
Coefficient of variation (CV): 6.759627143
Kurtosis: 2843.811943
Mean: 31.17597255
Median Absolute Deviation (MAD): 7.466538718
Skewness: 48.13887319
Sum: 177079.5241
Variance: 44410.48369
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
3.75 | 11 | 0.2%
4.95 | 10 | 0.2%
1.25 | 9 | 0.2%
2.95 | 9 | 0.2%
7.95 | 8 | 0.1%
12.75 | 7 | 0.1%
1.65 | 7 | 0.1%
8.25 | 7 | 0.1%
5.95 | 6 | 0.1%
4.15 | 6 | 0.1%
Other values (5479) | 5600 | 98.6%

Minimum 10 values
Value | Count | Frequency (%)
0.42 | 3 | 0.1%
0.535 | 1 | < 0.1%
0.65 | 1 | < 0.1%
0.79 | 1 | < 0.1%
0.8371428571 | 1 | < 0.1%
0.84 | 2 | < 0.1%
0.85 | 3 | 0.1%
1.002222222 | 1 | < 0.1%
1.02 | 1 | < 0.1%
1.03875 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
13305.5 | 1 | < 0.1%
4453.43 | 1 | < 0.1%
3861 | 1 | < 0.1%
3202.92 | 1 | < 0.1%
3096 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
1377.077778 | 1 | < 0.1%
1001.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%
931.5 | 1 | < 0.1%

max_recency
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 365
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 152.5357394
Minimum: 0
Maximum: 373
Zeros: 4
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 21
Q1: 67.75
median: 135
Q3: 225
95-th percentile: 351
Maximum: 373
Range: 373
Interquartile range (IQR): 157.25

Descriptive statistics

Standard deviation: 100.2286827
Coefficient of variation (CV): 0.6570832713
Kurtosis: -0.7720747076
Mean: 152.5357394
Median Absolute Deviation (MAD): 75
Skewness: 0.5151050787
Sum: 866403
Variance: 10045.78883
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
154 | 44 | 0.8%
63 | 43 | 0.8%
213 | 40 | 0.7%
42 | 37 | 0.7%
64 | 35 | 0.6%
53 | 35 | 0.6%
101 | 35 | 0.6%
50 | 34 | 0.6%
119 | 34 | 0.6%
35 | 33 | 0.6%
Other values (355) | 5310 | 93.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 4 | 0.1%
1 | 11 | 0.2%
2 | 7 | 0.1%
3 | 13 | 0.2%
4 | 18 | 0.3%
5 | 10 | 0.2%
7 | 14 | 0.2%
8 | 7 | 0.1%
9 | 16 | 0.3%
10 | 23 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
373 | 23 | 0.4%
372 | 22 | 0.4%
371 | 17 | 0.3%
369 | 4 | 0.1%
368 | 13 | 0.2%
367 | 16 | 0.3%
366 | 16 | 0.3%
365 | 20 | 0.4%
364 | 12 | 0.2%
363 | 1 | < 0.1%

qt_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 206
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.0931338
Minimum: 0
Maximum: 9360
Zeros: 4190
Zeros (%): 73.8%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 38
Maximum: 9360
Range: 9360
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 238.4156455
Coefficient of variation (CV): 12.48698343
Kurtosis: 1036.930505
Mean: 19.0931338
Median Absolute Deviation (MAD): 0
Skewness: 29.86213869
Sum: 108449
Variance: 56842.02003
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 4190 | 73.8%
1 | 169 | 3.0%
2 | 148 | 2.6%
3 | 105 | 1.8%
4 | 89 | 1.6%
6 | 78 | 1.4%
5 | 61 | 1.1%
12 | 51 | 0.9%
7 | 44 | 0.8%
8 | 43 | 0.8%
Other values (196) | 702 | 12.4%

Minimum 10 values
Value | Count | Frequency (%)
0 | 4190 | 73.8%
1 | 169 | 3.0%
2 | 148 | 2.6%
3 | 105 | 1.8%
4 | 89 | 1.6%
5 | 61 | 1.1%
6 | 78 | 1.4%
7 | 44 | 0.8%
8 | 43 | 0.8%
9 | 41 | 0.7%

Maximum 10 values
Value | Count | Frequency (%)
9360 | 1 | < 0.1%
9014 | 1 | < 0.1%
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%
3768 | 1 | < 0.1%
3331 | 1 | < 0.1%
2878 | 1 | < 0.1%
2022 | 1 | < 0.1%
2012 | 1 | < 0.1%
1776 | 1 | < 0.1%

purchased_returned_diff
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 1851
Distinct (%): 32.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.50240829
Minimum: 0
Maximum: 12.18870266
Zeros: 115
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.386294361
Q1: 4.65396035
median: 5.749392986
Q3: 6.682421685
95-th percentile: 7.960323629
Maximum: 12.18870266
Range: 12.18870266
Interquartile range (IQR): 2.028461334

Descriptive statistics

Standard deviation: 1.830438717
Coefficient of variation (CV): 0.3326613766
Kurtosis: 1.184339164
Mean: 5.50240829
Median Absolute Deviation (MAD): 1.005418397
Skewness: -0.8326079357
Sum: 31253.67909
Variance: 3.350505896
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 115 | 2.0%
0.6931471806 | 70 | 1.2%
1.098612289 | 51 | 0.9%
1.386294361 | 49 | 0.9%
1.609437912 | 35 | 0.6%
1.791759469 | 29 | 0.5%
2.48490665 | 24 | 0.4%
4.477336814 | 22 | 0.4%
4.276666119 | 22 | 0.4%
1.945910149 | 20 | 0.4%
Other values (1841) | 5243 | 92.3%

Minimum 10 values
Value | Count | Frequency (%)
0 | 115 | 2.0%
0.6931471806 | 70 | 1.2%
1.098612289 | 51 | 0.9%
1.386294361 | 49 | 0.9%
1.609437912 | 35 | 0.6%
1.791759469 | 29 | 0.5%
1.945910149 | 20 | 0.4%
2.079441542 | 18 | 0.3%
2.197224577 | 7 | 0.1%
2.302585093 | 17 | 0.3%

Maximum 10 values
Value | Count | Frequency (%)
12.18870266 | 1 | < 0.1%
11.25085916 | 1 | < 0.1%
11.24958472 | 1 | < 0.1%
11.14245581 | 1 | < 0.1%
11.06857399 | 1 | < 0.1%
11.0511122 | 1 | < 0.1%
11.03178808 | 1 | < 0.1%
10.96856029 | 1 | < 0.1%
10.95103459 | 1 | < 0.1%
10.8075235 | 1 | < 0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1225
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5464364876
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.5 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.01102941176
Q1: 0.02491103203
median: 1
Q3: 1
95-th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.975088968

Descriptive statistics

Standard deviation: 0.5504765781
Coefficient of variation (CV): 1.007393523
Kurtosis: 139.3150778
Mean: 0.5464364876
Median Absolute Deviation (MAD): 0
Skewness: 4.869349083
Sum: 3103.759249
Variance: 0.303024463
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 2866 | 50.5%
2 | 47 | 0.8%
0.0625 | 17 | 0.3%
0.02777777778 | 17 | 0.3%
0.02380952381 | 16 | 0.3%
0.08333333333 | 15 | 0.3%
0.09090909091 | 15 | 0.3%
0.02941176471 | 14 | 0.2%
0.03448275862 | 14 | 0.2%
0.02127659574 | 13 | 0.2%
Other values (1215) | 2646 | 46.6%

Minimum 10 values
Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005479452055 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005586592179 | 2 | < 0.1%
0.005602240896 | 1 | < 0.1%
0.005617977528 | 2 | < 0.1%
0.00566572238 | 1 | < 0.1%
0.005681818182 | 2 | < 0.1%
0.005698005698 | 3 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
17 | 1 | < 0.1%
4 | 1 | < 0.1%
3 | 5 | 0.1%
2 | 47 | 0.8%
1.142857143 | 1 | < 0.1%
1 | 2866 | 50.5%
0.75 | 1 | < 0.1%
0.6666666667 | 3 | 0.1%
0.550802139 | 1 | < 0.1%
0.5308310992 | 1 | < 0.1%

Interactions

[144 pairwise interaction plots (12 × 12 variables); images not included]

Correlations

Auto

The auto setting is an easily interpretable pairwise metric that selects a method from the types of the two columns: categorical-categorical uses Cramér's V, numerical-categorical uses Cramér's V (on a discretized numerical column), and numerical-numerical uses Spearman's ρ. This configuration picks the most suitable method for each pair of columns.
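
The type-to-method mapping described for the auto setting can be written as a small dispatch function. This is an illustrative sketch only: the function name and the string labels are assumptions, not pandas-profiling's internal API.

```python
def auto_method(type_a, type_b):
    """Return the method the auto mapping assigns to a pair of column types."""
    allowed = {"numerical", "categorical"}
    if type_a not in allowed or type_b not in allowed:
        raise ValueError(f"unsupported column types: {type_a}, {type_b}")
    if type_a == type_b == "numerical":
        return "spearman"
    # Any pair involving a categorical column uses Cramér's V;
    # a numerical partner would be discretized first.
    return "cramers_v"


print(auto_method("numerical", "numerical"))
print(auto_method("numerical", "categorical"))
```

Since all 12 variables in this dataset are numeric, every pair here falls into the Spearman branch.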

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
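
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. The helper names and the toy data are illustrative; ties receive average ranks, which is a common convention but an assumption here.

```python
import math


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # the 1/n factors cancel


def spearman(x, y):
    """Pearson correlation of the rank variables."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotonic but nonlinear (y = x**3)
print(spearman(x, y))    # close to 1: the relationship is perfectly monotonic
print(pearson(x, y))     # below 1: the relationship is not linear
```

The gap between the two printed values illustrates why ρ is preferred for heavily skewed columns such as gross_revenue.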

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the line does not change the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
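
A minimal standard-library sketch of this formula follows, together with a check of the invariance under changes in location and (positive) scale. The function name and the sample numbers are illustrative.

```python
import math


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
r = pearson(x, y)

# Shift and rescale each variable separately; r should be unchanged.
r_rescaled = pearson([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(abs(r - r_rescaled) < 1e-9)
```

Rescaling by a negative factor would flip the sign of r while leaving its magnitude unchanged.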

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
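
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements it directly; the function name and toy data are illustrative, and libraries commonly report a tie-adjusted variant (tau-b), which can give different values when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts only in the denominator
    return (concordant - discordant) / len(pairs)


# Six pairs in total: five concordant, one discordant (the swap of 3 and 2).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The O(n²) pair loop is fine for a sketch; with 5680 observations a production implementation would use a sort-based algorithm instead.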

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

row | df_index | customer_id | gross_revenue | recency_days | purchases_quantity | basket_size | qt_products | avg_ticket | max_recency | qt_returns | purchased_returned_diff | frequency
0 | 0 | 17850 | 5391.21000 | 372.00000 | 34.00000 | 1733.00000 | 297.00000 | 18.15222 | 372.00000 | 40.00000 | 7.43426 | 17.00000
1 | 1 | 13047 | 3232.59000 | 56.00000 | 9.00000 | 1390.00000 | 171.00000 | 18.90404 | 71.00000 | 35.00000 | 7.21156 | 0.02830
2 | 2 | 12583 | 6705.38000 | 2.00000 | 15.00000 | 5028.00000 | 232.00000 | 28.90250 | 73.00000 | 50.00000 | 8.51278 | 0.04032
3 | 3 | 13748 | 948.25000 | 95.00000 | 5.00000 | 439.00000 | 28.00000 | 33.86607 | 137.00000 | 0.00000 | 6.08450 | 0.01792
4 | 4 | 15100 | 876.00000 | 333.00000 | 3.00000 | 80.00000 | 3.00000 | 292.00000 | 333.00000 | 22.00000 | 4.06044 | 0.07317
5 | 5 | 15291 | 4623.30000 | 25.00000 | 14.00000 | 2102.00000 | 102.00000 | 45.32647 | 78.00000 | 29.00000 | 7.63675 | 0.04011
6 | 6 | 14688 | 5630.87000 | 7.00000 | 21.00000 | 3621.00000 | 327.00000 | 17.21979 | 48.00000 | 399.00000 | 8.07776 | 0.05722
7 | 7 | 17809 | 5411.91000 | 16.00000 | 12.00000 | 2057.00000 | 61.00000 | 88.71984 | 70.00000 | 41.00000 | 7.60887 | 0.03352
8 | 8 | 15311 | 60767.90000 | 0.00000 | 91.00000 | 38194.00000 | 2379.00000 | 25.54346 | 21.00000 | 474.00000 | 10.53795 | 0.24332
9 | 9 | 16098 | 2005.63000 | 87.00000 | 7.00000 | 613.00000 | 67.00000 | 29.93478 | 87.00000 | 0.00000 | 6.41836 | 0.02439

Last rows

row | df_index | customer_id | gross_revenue | recency_days | purchases_quantity | basket_size | qt_products | avg_ticket | max_recency | qt_returns | purchased_returned_diff | frequency
5670 | 5761 | 21988 | 4839.42000 | 1.00000 | 1.00000 | 1074.00000 | 62.00000 | 78.05516 | 1.00000 | 0.00000 | 6.97915 | 1.00000
5671 | 5762 | 13298 | 360.00000 | 1.00000 | 1.00000 | 96.00000 | 2.00000 | 180.00000 | 1.00000 | 0.00000 | 4.56435 | 1.00000
5672 | 5763 | 14569 | 227.39000 | 1.00000 | 1.00000 | 79.00000 | 12.00000 | 18.94917 | 1.00000 | 0.00000 | 4.36945 | 1.00000
5673 | 5764 | 21992 | 17.90000 | 1.00000 | 1.00000 | 14.00000 | 7.00000 | 2.55714 | 1.00000 | 0.00000 | 2.63906 | 1.00000
5674 | 5765 | 21993 | 3.35000 | 1.00000 | 1.00000 | 2.00000 | 2.00000 | 1.67500 | 1.00000 | 0.00000 | 0.69315 | 1.00000
5675 | 5766 | 21994 | 5699.00000 | 1.00000 | 1.00000 | 1747.00000 | 634.00000 | 8.98896 | 1.00000 | 0.00000 | 7.46566 | 1.00000
5676 | 5767 | 21995 | 6756.06000 | 0.00000 | 1.00000 | 2010.00000 | 730.00000 | 9.25488 | 0.00000 | 0.00000 | 7.60589 | 1.00000
5677 | 5768 | 21996 | 3217.20000 | 0.00000 | 1.00000 | 654.00000 | 59.00000 | 54.52881 | 0.00000 | 0.00000 | 6.48311 | 1.00000
5678 | 5769 | 21997 | 3950.72000 | 0.00000 | 1.00000 | 731.00000 | 217.00000 | 18.20608 | 0.00000 | 0.00000 | 6.59441 | 1.00000
5679 | 5770 | 12713 | 794.55000 | 0.00000 | 1.00000 | 505.00000 | 37.00000 | 21.47432 | 0.00000 | 0.00000 | 6.22456 | 1.00000